Wangze Ni

TravelEval: A Comprehensive Benchmarking Framework for Evaluating LLM-Powered Travel Planning Agents

May 31, 2026
Via arXiv

When AI reviews science: Can we trust the referee?

Apr 26, 2026
Via arXiv

DualBreach: Efficient Dual-Jailbreaking via Target-Driven Initialization and Multi-Target Optimization

Apr 21, 2025
Via arXiv